Most web-based disease surveillance systems that give epidemic alerts are based on very large and
unstructured data from various news sources, social media and online queries that are parsed by complex
algorithms. This tends to generate results that are diverse and non-specific. When considered along with
the fact that there are no existing standards for mining and analyzing data from the internet, the results or
decisions reached based on internet sources have been classified as low-quality. This paper proposes a
web-based grassroots epidemic alert system that is based on data collected specifically from primary health
centers, hospitals and registered laboratories. It takes a more traditional approach to indicator-based disease
surveillance as a step towards standardizing web-based disease surveillance. It makes use of a threshold
value that is based on the third quartile (75th percentile) to determine the need to trigger the alarm for the
onset of an epidemic. It also includes demographic information for deeper analysis.

Keyword: Alert, Disease, Epidemic, Health-care, Infectious, Model, Surveillance

Copyright © 2018 Institute of Advanced Engineering and Science.
1. INTRODUCTION

In the last quarter of 2017, a rare disease known as Monkey Pox broke out in Nigeria, a
nation in the western part of Africa. Information about the outbreak reached the Nigeria Centre for
Disease Control (NCDC) through the Niger Delta University Teaching Hospital (NDUTH), Okolobiri,
Bayelsa State. By the end of the year, a total of forty-three (43) cases spread across eight states in Nigeria
had been confirmed. New cases of Monkey Pox stopped being reported in Nigeria by the start of
2018. This was quite laudable, as Nigeria uses traditional surveillance methods to watch out for
disease outbreaks [1]. This case study reinforces the need to detect the outbreak of infectious diseases at the
earliest stage, especially now that the world has become a “global village”. Thus, nations invest
heavily in disease surveillance systems. The reason is not far-fetched: the outbreak of an infectious disease,
if not contained at its earliest stage, could lead to catastrophic local, national and worldwide consequences.
Economies can be brought to their knees by epidemics that could not be contained because they were not
detected early.

The internet has become a powerful tool for detecting epidemics at their earliest stage. It has made it
possible to collate and deliver information on the progression of disease outbreaks, epidemics and, in some
cases, pandemics within days or even hours. The power of the internet is now being explored on a worldwide
scale for disease surveillance, and attention has lately been shifting to the possibilities of the
internet for web-based disease surveillance [2]. Today, many web-based systems serve the world in various
languages and utilize data from news sources and social media to detect epidemics at their earliest stage.
Access to these web-based disease surveillance systems may be restricted, semi-restricted or free
to the public.

Event-based surveillance systems usually utilize data from online sources. The data acquired could
be moderated or aggregated automatically. Syndromic and indicator-based surveillance
systems make use of health data from healthcare providers, diagnostic laboratories and surveillance specialists
in governmental organizations. There also exists a whole arsenal of web-based disease surveillance systems
that give early alerts about the outbreak of diseases based on queries made by internet users. The main goal
of all these web-based systems remains the early detection of an epidemic outbreak.

The United States can boast of the Program for Monitoring Emerging Diseases (ProMED-mail) and
Epi-SPIDER for web-based disease surveillance and bio-security intelligence. Other well-known disease
surveillance systems like Influenzanet (Europe), the Global Public Health Intelligence Network - GPHIN
(Canada), the Global Outbreak Alert and Response Network (GOARN) and Google Flu Trends are also
web-based. These web-based epidemic alert systems all make use of data mined from the internet and complex
algorithms to analyze data for useful information on disease outbreaks. But most of these systems extract
information from a large pool of data on a very large number of infections and thus have difficulty presenting
critical information concisely and without ambiguity. The reason is that, to date, no
sound methodology has been developed for measuring the relationship between health-related data mined
from the internet and actual public health events like epidemics and pandemics.

In this paper, a shift to a more traditional approach that is web-based is proposed. It would
make use of clinical data from primary health centers, diagnostic laboratories and hospitals. The data
collected would be classified using the syndromic codes contained in the tenth revision of the International
Classification of Diseases (ICD-10). A weekly percentile check forms the basis for determining if a disease
has reached epidemic levels. Algorithms working in the background analyze information obtained from a network
of primary health centers, hospitals and laboratories to generate graphically illustrated results on about fifty (50)
diseases, including those on the World Health Organization (WHO) watch list. A threshold based on the third
quartile (75th percentile) for each week is used to determine when to trigger the epidemic alert. Diseases on the
watch list, when detected, trigger a special alarm. Information like demographics and location is also
included for more detailed analysis.
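
The weekly percentile check described above can be illustrated with a minimal Python sketch. It is not the
system's actual implementation: the function names, the choice of the "inclusive" quantile method, the rule
that an alert fires only when the current count strictly exceeds the threshold, and the sample watch list are
all assumptions made for illustration.

```python
from statistics import quantiles

# Hypothetical watch list; the real list would come from WHO guidance.
WATCH_LIST = {"monkeypox", "ebola"}


def weekly_threshold(historical_counts):
    """Third quartile (75th percentile) of past case counts for one calendar week."""
    if len(historical_counts) < 2:
        raise ValueError("need at least two historical counts")
    # quantiles(..., n=4) returns the three cut points [Q1, Q2, Q3].
    return quantiles(historical_counts, n=4, method="inclusive")[2]


def classify_week(disease, current_count, historical_counts):
    """Return 'special', 'epidemic' or 'none' for one disease in one week."""
    # Assumption: any detected case of a watch-list disease raises the special alarm.
    if disease.lower() in WATCH_LIST and current_count > 0:
        return "special"
    # Assumption: the epidemic alert fires when the count exceeds the threshold.
    if current_count > weekly_threshold(historical_counts):
        return "epidemic"
    return "none"


history = [4, 6, 8, 10, 12]  # made-up counts for the same week in earlier years
print(weekly_threshold(history))             # third quartile of the history
print(classify_week("malaria", 11, history))
print(classify_week("monkeypox", 1, history))
```

Keeping the threshold per disease and per week means seasonal diseases are compared against their own
seasonal history rather than against a single annual figure.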


2. LITERATURE REVIEW 
2.1. The big argument 
The big argument about the true difference between the words epidemic and pandemic was
addressed by David M. Morens et al [3]. In a bid to describe pandemics, the researchers elucidated eight
key factors that tend to characterize a widespread infection as a pandemic. In their exposition, the researchers
surmised that a disease can be termed pandemic if:
a. It covers a very large geographic area: trans-regional (two or more adjacent regions of the
world), inter-regional (two or more non-adjacent regions) or global.
b. It can be traced from place to place as it progresses.
c. It spreads explosively and has a high attack rate.
d. The population has minimal immunity to it.
e. It is novel (like HIV/AIDS) or arises through a new strain of an existing pathogen.
f. It is very contagious, regardless of the means by which it is spread.
g. It is very severe.

In all, the authors concluded that defining the term pandemic as a large epidemic makes
sense and avoids the pitfalls of inconsistency. In addition, the researchers suggested that the term pandemic
be used only for infectious diseases.




2.2. Web-based disease surveillance and epidemic alert system: state-of-the-art 

Erini reviewed some of the latest technologies and tools used to carry out regional and global
infectious disease surveillance [4]. A review of epidemic modeling was also done. The need to
quickly and efficiently classify previously unknown strains of pathogens that are responsible for emerging
infectious diseases was stressed. Disease surveillance with the latest and most effective tools was also
encouraged to ensure that novel and re-emerging infectious diseases do not attain epidemic or pandemic levels.
The author emphasized the need for large-scale infectious disease surveillance networks, especially in
today's world, which is fast becoming smaller due to the “never-seen-before” bridging of communication and
transportation gaps. The author highlighted event-based surveillance, web-based real-time surveillance, early
warning and alert response networks, infectious disease modeling, social media and new technologies in
pathogen discovery as the key drivers in the new age of disease surveillance.

Collier, of the National Institute of Informatics, Tokyo, Japan, surveyed the importance of
Epidemic Intelligence (EI). The work focused on the use of artificial
intelligence, social networks, data mining tools and natural language processing to monitor the progress of
disease outbreaks from predominantly unstructured data. He surmised that at the core of Epidemic
Intelligence using unstructured data was the technology called Text Mining [5].

In their work, Jennifer L. Gardy et al shed light on the need to develop a disease surveillance system
that is global in scope, works on the go and is genomics-informed, especially after the Ebola and Zika
epidemics. The researchers proposed a One Health system that is based on integrating genomic diagnostics and
epidemiology into existing disease surveillance systems [6]. The researchers envisioned a system
that integrates human, animal and environmental health to proffer disease surveillance solutions to regions of
the world with inadequate to non-existent laboratory facilities [7]. The researchers paid particular attention to
how epidemics like Ebola and Zika progressed uninhibited for months, unnoticed by even the
most advanced disease surveillance systems, and were discovered only after they had grown to
epic proportions. Novel pathogen identification and the detection of certain old pathogens in new regions
were identified as the major gaps in all existing disease surveillance systems. The major impediments to the One
Health scheme proposed by the researchers were identified as government policies and conflicts between
medical practitioners and researchers from lower-income, middle-income and better-resourced nations. The
future of disease surveillance, as highlighted by the researchers, is a genomics-informed one [8] in which all
the factors affecting health are considered on a global scale with complete and uninhibited transparency.

S.J. Yan et al explored the accuracy and timeliness of data mined from unstructured sources on the
internet for Epidemic Intelligence (EI). The researchers surveyed many publications on Text
Mining for Epidemic Intelligence and concluded that serious attention should be paid to
the timeliness and accuracy of information about disease outbreaks obtained by mining ubiquitous, free and
unstructured data from the internet [9].

Simon Pollet et al focused on the use of “Big Data” to get Epidemic Intelligence (EI) on vector-borne
diseases (VBD) in middle and low income countries [10]. The research was a review of the
performance of various internet-based tools and techniques that have been employed to mine data on
vector-borne diseases. The researchers carried out an in-depth survey and concluded that more reviews
are needed to ascertain the true impact of using “Digital Epidemiology” in tandem with more
conventional or traditional means of disease surveillance. The research also called for more surveys on the
reaction of end users to the metrics used to classify or gauge the outbreak of a disease. The researchers also
emphasized that “Digital Epidemiology” was not made to replace but rather to complement traditional methods
of vector-borne disease surveillance.

Eun Kyong Shin et al looked closely at the progress of online clinical trials in the United States of
America. The research focused on the popularity and impact of online clinical trials and health studies, from
their first appearance online to their perceived future. The research work also exhaustively detailed the
potential and obvious uses of the internet for health studies.

Natalie S. and Collins A., in the article Web-based Surveillance of Illness in Childcare Centers,
made a proposal for active bio-surveillance in childcare centers [11]. The authors were of the opinion that
monitoring childcare centers for disease outbreaks was more effective than the traditional method of
monitoring schools only. The article pointed out summer breaks as one of the core reasons why
monitoring schools only for disease outbreaks was not effective enough. The authors also proposed that the
bio-surveillance of childcare centers be web-based and that reports be submitted on a weekly basis to the central
public health department. The key data monitored by the proposed childcare
bio-surveillance program were the children's categories (toddlers, infants and pre-schoolers) and the illness reported.
The statistical report was expressed in terms of percentages and actual whole numbers for each major
category. The authors claimed that the system, implemented in a Michigan county (United States of
America), was able to detect outbreaks of Gastroenteritis and Hand-Foot-Mouth disease when the more
conventional school-based disease monitoring system was not available, especially during winter and summer
seasons.

The Global Public Health Intelligence Network (GPHIN) was credited with sending the first alert on
the acute respiratory illness outbreak code-named MERS-CoV (Middle East Respiratory Syndrome
Coronavirus) [12]. It is a web-based program that uses specialized algorithms to harness the power of Big
Data to mine for clues that signal the onset of an epidemic. The web-based program, in conjunction with a
multilingual and multidisciplinary team, culled and analyzed information from over thirty thousand sources
in nine languages for potential clues to the onset of an epidemic anywhere in the world. The authors made it
known that the system is being adopted by many nations for national disease surveillance. The authors
emphasized how future GPHIN projects plan to utilize more of the power of Big Data, especially from social
media outlets, using sophisticated algorithms to mine for epidemic clues.

The use of drivers of emerging infectious diseases [13], [14] was suggested by Sarah H. Olson et al 
to develop the framework for digital detection of Infectious Disease (ID) events. The researchers were of the 
opinion that close monitoring of infectious disease drivers could provide a viable means for the early 
detection of potential infectious disease epidemics especially in the case of emerging infectious diseases. The 
researchers identified some of these drivers as climate and meteorological data. The researchers also 
presented a sample framework for the use of Infectious Disease (ID) drivers in digital disease surveillance 
programs. An extensive review of previous infectious driver models was also done and the gaps were 
identified. 

Nsoesie et al carried out an extensive review of the most recent digital technologies that have been
employed for infectious disease surveillance at mass gathering events. Interestingly, among the digital
technologies was the internet or web-based approach to disease surveillance at mass gathering events [15].
Notable among the web-based digital disease surveillance systems for mass gathering events was the Healthcare
Electronic Surveillance Network (HESN) implemented by the Kingdom of Saudi Arabia to closely monitor
respiratory, gastrointestinal, cardiovascular, skin and ear/nose diagnoses. The system was said to have been
very effective during the 2013 Hajj season. The HESN captures diagnostic data from healthcare practitioners,
clinic and hospital staff, paramedics and other health-related outfits for semi-automatic analysis and prompt
decision making. In the 2002 Salt Lake City winter games, a web-based and fully automated infectious disease
surveillance system was used to analyze health data from several sources; its most prominent feature was the
triggering of an alert if any disease outbreak was suspected. The system gave two alerts for respiratory
infections that were promptly addressed by health officials. The impact of web-based epidemic alert
systems during many other mass gathering events like the world cup and religious gatherings was
elucidated. Small gatherings were not left out. A good example given was the combined use of the Global
Public Health Intelligence Network (GPHIN) and the Medical System (Medisys), both web-based disease
surveillance systems, for infectious disease surveillance during the 2012 European football championship.

Some researchers [16] carried out a systematic review to determine the extent and depth to which
Online Social Networks (OSNs) have been used for disease surveillance. The study submitted that many
models, frameworks and systems for disease surveillance using online social networks have been developed
and, in many cases, implemented. The researchers acknowledged that online social media provides a
viable means of tracking pandemics because of its vast and varied, though unstructured, nature. The large
population of people from various places all over the world and the exchange of information that goes on
unabated via social media platforms have been mined and analyzed by an array of complex algorithms and
computational linguistics to track pandemics. The criteria used for each OSN pandemic tracking system were
numerous and made it clear that the use of online social networks to track the onset or progress of a pandemic
may never replace traditional and more conventional methods of disease surveillance.

The need for web-based disease surveillance is also being explored and implemented by the armed 
forces as seen in the joint bio-surveillance portal championed by the Republic of Korea and the United States 
of America [17]. 

The use of purely traditional means of disease surveillance and the emergence of the WHO as the 
international instrument for the expediting of intervention programs in the event of unusual and especially 
tough epidemics and pandemics has led to the disappearance of a number of infectious diseases around the 
world [18]. But emphasis today keeps shifting towards real-time disease surveillance [19]. The report 
submitted that the state-of-the-art for real-time disease surveillance depends heavily on social media. 

Jihye Choi et al also reviewed various web-based infectious disease surveillance systems that have
been used. The study focused on the current state-of-the-art, benefits and challenges associated with
web-based disease surveillance systems that have been implemented in various ways to support the more
conventional or traditional surveillance methods [20]. The authors looked closely at the strengths and
weaknesses of eleven web-based surveillance systems and commented on how the weaknesses of some of
the web-based surveillance methods already in use can be addressed. The researchers submitted that
web-based disease surveillance methods are adaptable, low-cost and intuitive, and can be operated in real-time.
The researchers identified privacy issues and inaccuracies in prediction and interpretation as some of the potential
challenges of internet-based epidemic alert systems. The authors also noted the absence of a functional
web-based epidemic monitoring system in some nations with advanced information and communication
technology presence. The authors also classified standard disease surveillance systems, many of which were
web-based. This is shown in Figure 1.
[Figure 1: tree diagram of standard disease surveillance system logics, split into two branches.
Event-based surveillance systems: data obtained from events in real time or indirectly from reports
transmitted through various communication channels [24]. Indicator-based surveillance systems: data
reported by healthcare providers and diagnostic laboratories, collected by surveillance specialists in
governmental health agencies [24]. The systems shown are grouped as moderated systems, automatic systems
and news aggregators/other, and include ProMED-Mail, GPHIN, GOARN, EpiSPIDER, HealthMap, EpiSimS,
BioCaster, MediSys, Google Flu Trends, Influenzanet and GETWELL.]


Figure 1. Classification of standard disease surveillance systems [20] 


A research work carried out during a project backed by the European Commission acknowledged
the high outlay of capital on electronic disease surveillance systems meant to detect outbreaks of
emerging and re-emerging infections on time. The report submitted that it remains unclear whether existing
sophisticated real-time electronic surveillance systems can effectively detect the outbreak of an epidemic
early.

The Sustainable Surveillance Workgroup made some suggestions on how to build a sustainable
disease surveillance system [22] that is equipped to provide information about infection outbreaks
continuously. The report affirmed that continuous and unabated disease surveillance is a
must for the benefit of public health. The report also stressed the need to increase the budget allotted
for surveillance, maintain an active surveillance workforce and pursue rigorous disease
surveillance research that would lead to a better understanding of public health and help with the creation of
policies and decision making.

The lessons learned from the various implementations of web-based disease surveillance systems
can be seen in the article by MO Lwin et al on how Mo-Buzz [23], a mobile pandemic surveillance
system for Dengue, was implemented in Colombo, Sri Lanka. The mobile application was developed to take
advantage of Sri Lanka’s large population of mobile device users. The study submitted that the traditional
Dengue reporting structure in Sri Lanka was excruciatingly slow because it was still paper-based. The
introduction of Mo-Buzz in two phases, one for the general public and another for Sri Lanka’s health
institutions, boosted the country’s ability to detect, keep track of and inform the public about Dengue
outbreaks. The researchers noted that though Mo-Buzz’s initial uptake was quite low, it picked up
and went as high as 76%. This study supports the view that mobile and social media outlets, which are all
web-based, are the future of global disease surveillance [24].


2.3. Traditional and syndromic surveillance of infectious diseases and pathogens 

Cedric Abat et al noted that many disease surveillance systems are in use all around the world [25].
The researchers summarized some of these disease surveillance methods and
looked at syndromic surveillance from the microbiology perspective. The researchers submitted
that disease surveillance data can be gathered from the Human Environment, with a focus on Environmental
data (water pollution, weather and air pollution) and Animal Health data (information about the health of
domestic and wild animals). Surveillance data can also be obtained from human behaviour, which consists of
Internet use (web queries, press dispatches, social media, press articles), Telephone (hotlines), Drug sales and
Absenteeism. Health Care also provides viable disease surveillance data via Sentinel surveillance (sentinel
physicians who agree to notify the public health authorities at regular intervals of patients presenting certain
specific symptoms of infectious diseases on the watch list), Chief complaints, Medical records, Hospital
discharge data, Microbiology orders, Disease reports and Demographics.

The researchers listed some surveillance strategies such as Disease-specific surveillance,
Event-based surveillance and Syndromic surveillance [26].


2.4. Infodemiology metrics 

Infodemiology helps determine the best way to tackle public health issues. It is possible to
collect data for this purpose in real-time. Internet queries have been used to predict the outbreak of an
epidemic. Twitter microblogs, the news and the way people use the internet for health services have been
monitored. The information gleaned from all these sources is analysed, and useful information
that can inform health policies is inferred [27]. Many metrics have been used to gauge
the impact of information obtained from the internet.

It has been said that there is a need to standardize infodemiology and infosurveillance metrics.
Infodemiology is primarily electronic (its data are obtained using some kind of algorithm). According to the authors,
infodemiology’s most basic metrics could be supply related (internet users' postings) or demand related
(internet users' buying habits) [28]. On the supply side, the most basic metrics were information prevalence
and information occurrence ratios. On the demand side, the most basic metrics were the number of searches
on a specific topic and the number of clicks on a website about a specific topic. An active method involving
online surveys of consumers of health products also provides a good infodemiology metric.
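
As a rough illustration of the supply-side metrics, the sketch below assumes that information prevalence
can be read as the share of postings that mention a keyword, and that an occurrence ratio compares the
prevalence of two keywords. These readings, the function names and the sample postings are assumptions
for illustration; they are not definitions taken from [28].

```python
def information_prevalence(postings, keyword):
    """Share of postings that mention the keyword (case-insensitive)."""
    if not postings:
        return 0.0
    needle = keyword.lower()
    # Plain substring match: 'flu' would also match 'influenza'.
    hits = sum(1 for text in postings if needle in text.lower())
    return hits / len(postings)


def occurrence_ratio(postings, keyword_a, keyword_b):
    """Prevalence of keyword_a relative to keyword_b, or None if b never occurs."""
    prevalence_b = information_prevalence(postings, keyword_b)
    if prevalence_b == 0:
        return None
    return information_prevalence(postings, keyword_a) / prevalence_b


sample = [
    "Flu season is here",
    "fever and flu in Lagos",
    "cholera alert",
    "market prices",
]
print(information_prevalence(sample, "flu"))
print(occurrence_ratio(sample, "flu", "cholera"))
```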


2.5. Models for web-based surveillance and epidemic alert systems 

The ubiquitous and pervasive powers of the internet and social media have forced many states to
revise their disease surveillance policies. It is almost a must for states to carry out active disease surveillance
and inform the World Health Organization (WHO) of any epidemic outbreak. Individuals and
non-governmental organizations have continued to use the power of social media, the internet and
complex algorithms to report cases of epidemic outbreaks to WHO, in many instances before the outbreak is known
and accepted officially [29]. The model is shown in Figure 2.


[Figure 2: flowchart. A public health event detected by the surveillance system falls into one of three
categories: human cases of smallpox, polio (wild-type), SARS and influenza (new subtypes); events of
potential national or international public health concern; or events or cases of diseases that have a proven
ability to cause national or international public health concern. Four questions are then asked: Serious
public health impact? Unusual or unexpected event? Significant risk of international spread? Significant
risk of international travel or trade restrictions? If the answer is yes to at least 2 of the 4, the event is
reported to WHO.]

Figure 2. Decision-making instrument for International Health Regulation - IHR (2005), adapted from 
Annex 2 [29] 
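
The four-question branch of the decision instrument in Figure 2 can be sketched as follows. This is a
minimal illustration of the "yes to at least 2 of the 4" rule only; it does not model how an event enters
the flowchart. The class and field names are hypothetical, and the fourth field is named after the
"restrictions?" box of the figure.

```python
from dataclasses import dataclass


@dataclass
class EventAssessment:
    """Answers to the four questions of the decision instrument (names are illustrative)."""
    serious_public_health_impact: bool
    unusual_or_unexpected: bool
    risk_of_international_spread: bool
    risk_of_restrictions: bool


def should_report_to_who(assessment: EventAssessment) -> bool:
    """Report when at least two of the four questions are answered yes."""
    yes_count = sum([
        assessment.serious_public_health_impact,
        assessment.unusual_or_unexpected,
        assessment.risk_of_international_spread,
        assessment.risk_of_restrictions,
    ])
    return yes_count >= 2


event = EventAssessment(True, False, True, False)
print(should_report_to_who(event))
```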


WHO has an integrated global alert and response system for epidemics and pandemics. The system
is based on already existing and effective national health systems and an internationally coordinated response
system. Presently, WHO gets alerts about epidemics and pandemics through the health care systems of its
member states. The model uses the following phases for zoonotic diseases:

Phase 1-3: The phase for preparing, gathering up-to-date information and planning for emergency
response.

Phase 4-6: Actual emergency response and mitigating efforts.
In a more detailed format, the various phases have these components:

Phase 1: No human infections. Only animals are affected.

Phase 2: Humans have been infected but at very low to non-existent levels.

Phase 3: A tangible number of human infections have been recorded but no human-to-human
transmission yet.

Phase 4: Verified and recorded human-to-human infection with attendant “community-level outbreaks”
or full epidemic.

Phase 5: Verified and recorded human-to-human infection with attendant “community-level outbreaks”
in at least two countries within a defined WHO region. A pandemic is imminent.

Phase 6: Verified and recorded human-to-human infection with attendant “community-level outbreaks”
in at least two countries within a defined WHO region and at least one country in a different WHO region. A
pandemic is underway.
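
The phase descriptions above can be read as a simple decision rule, sketched below. The function name, its
inputs and especially the `tangible_cases` cut-off that separates Phase 2 from Phase 3 are assumptions made
for illustration; the text does not give a numeric boundary.

```python
def who_phase(human_cases, human_to_human, countries_by_region, tangible_cases=10):
    """Map outbreak facts to a phase number from 1 to 6.

    human_cases: confirmed human infections so far.
    human_to_human: True once human-to-human infection is verified.
    countries_by_region: mapping of WHO region name to the number of countries
        with community-level outbreaks in that region.
    tangible_cases: hypothetical cut-off between Phase 2 and Phase 3.
    """
    if human_cases == 0:
        return 1  # only animals are affected
    if not human_to_human:
        return 3 if human_cases >= tangible_cases else 2
    affected = [n for n in countries_by_region.values() if n > 0]
    two_in_one_region = any(n >= 2 for n in affected)
    if two_in_one_region and len(affected) >= 2:
        return 6  # a second WHO region is also affected
    if two_in_one_region:
        return 5
    return 4


print(who_phase(0, False, {}))
print(who_phase(40, True, {"Africa": 2, "Europe": 1}))
```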

Presently, WHO requires all national governments to report cases of any disease on
a watch list that keeps growing with new additions.

A Bayesian Hierarchical Poisson Model with a hidden Markov model was proposed by 
D. Conesa et al for the early detection of influenza epidemic outbreak [30]. The model relied greatly on an 
intensity parameter that was set by the incidence frequency. The incidence rate was considered as a normal 
distribution in which its parameters, mean and variance, were modeled to reflect the phase of the system, be 
it epidemic or non-epidemic. The transition took into cognizance previous weekly epidemic states. The 
authors gave samples of how to implement the statistical model and used Bayesian Inference to define the 
state of an influenza epidemic at any moment. The researchers gave the transition probabilities as: 


\[
P(Z_{i+1,j} = l \mid Z_{i,j} = k) = p_{k,l}, \qquad k, l \in \{0, 1\}
\]
Where:

$Z_{i,j}$: An unobserved (hidden) random variable that indicates the phase of the modeled system as either
epidemic (1) or non-epidemic (0)

$p_{k,l}$: Suitable probabilities

$i$: Day (during the week)

$j$: Season
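
To make the role of the transition probabilities concrete, the sketch below propagates the probability of
being in the non-epidemic (0) or epidemic (1) state through a two-state Markov chain. It illustrates only
the state transitions, not the full Bayesian hierarchical Poisson model of [30]; the function name and the
numeric probabilities are made up for the example.

```python
def propagate(state_probs, transition, steps=1):
    """Advance [P(Z=0), P(Z=1)] by the given number of time steps.

    transition[k][l] is the probability of moving from state k to state l.
    """
    for row in transition:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row of the transition matrix must sum to 1")
    probs = list(state_probs)
    for _ in range(steps):
        probs = [
            sum(probs[k] * transition[k][l] for k in (0, 1))
            for l in (0, 1)
        ]
    return probs


# Hypothetical values: a non-epidemic period usually stays non-epidemic,
# and an epidemic period usually persists into the next time step.
p = [[0.9, 0.1],
     [0.2, 0.8]]
print(propagate([1.0, 0.0], p, steps=2))
```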


The Moving Epidemic Method (MEM) was used in [31] to model the incidence rate of influenza-like
illness (ILI) and Acute Respiratory Illness (ARI) for some European countries. The values obtained
were used to compute the various intensity levels adopted by the research, namely Baseline, Low, Medium,
High and Very High. The researchers used these benchmarks to compare the epidemic level of ILI in various
European nations for different time periods from the 1996/1997 to the 2013/2014 seasons. The authors
concluded that these comparisons are important for a firm understanding of seasonal epidemic patterns and
thus should be incorporated into automated disease surveillance systems at national and international levels.
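
Once intensity thresholds are available, assigning a weekly incidence rate to one of the five levels is a
simple lookup, sketched below. The sketch does not implement MEM itself, which derives the thresholds
from historical seasons; the threshold values, the function name and the convention that a rate equal to a
threshold falls in the higher level are assumptions for illustration.

```python
from bisect import bisect_right

LEVELS = ["Baseline", "Low", "Medium", "High", "Very High"]


def intensity_level(rate, thresholds):
    """Return the intensity level for a rate, given four ascending thresholds."""
    if len(thresholds) != 4 or list(thresholds) != sorted(thresholds):
        raise ValueError("expected four thresholds in ascending order")
    # bisect_right counts how many thresholds are less than or equal to the rate.
    return LEVELS[bisect_right(thresholds, rate)]


# Hypothetical thresholds, in cases per 100,000 population.
thresholds = [50, 100, 200, 400]
for weekly_rate in (30, 50, 150, 399, 400):
    print(weekly_rate, intensity_level(weekly_rate, thresholds))
```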

In the article, Zika Virus: A New Pandemic Threat [32], allusion was made to a special software 
application, Zika Tracker, that was used to aid voluntary reporting of confirmed Zika virus infection cases to 
help the Americas contain the spread of the virus which was suspected to be the main cause of an alarming 
rise in Microcephaly cases in the Americas. This is one of the very basic examples of a disease alert and 
monitoring system that exploits disease surveillance at the grassroots. 

Ruth A. Ashton et al gave an insightful exposition of the usefulness of school-based disease
surveillance with malaria as a case study [33]. The research focused on a pilot programme that was carried
out in Ethiopia to monitor malaria epidemics, with particular attention to school absenteeism and febrile
illness. The researchers submitted that many challenges hampered the study. The focus on a school-based
system brought a serious challenge of population representation, as almost 46% of Ethiopian school-aged
children are not enrolled in school. The researchers suggested that another pilot project be carried out
when reported malaria cases rise substantially above the usual level. In all, the
researchers noted that the sensitivity of the school-based syndromic surveillance to detect epidemics could
not be fully ascertained. The model used is shown in Figure 3.

Four critical areas that pose serious problems to disease surveillance on a global scale were outlined 
by [34] as scientific methods, international policies, technical resources, financial resources and human 
resources. The researchers gave a notional scheme for a global disease surveillance and response process. 


Epidemic Alert System: A Web-based Grassroots Model (Etinosa Noma Osaghae) 


ISSN: 2088-8708


Some researchers [35] described how epiDMS (Data Management and Analytics for Decision
Making from Epidemic Spread Simulation Ensembles) has helped to plug some critical holes relating to
scalability, multiple interdependent parameters and complex dynamic processes during an ongoing
epidemic. The researchers claimed that the data management and analytics tools offered by epiDMS help
with the decision-making process in the event of an epidemic, with significant health and economic benefits.
Figure 4 depicts the epiDMS model.


[Figure 3 diagram: two-phase study design. Phase 1 covers the minor transmission season (March - May 2012)
and the major transmission season (October - December 2012), with community surveys (10 sites), school
surveys (6 sites, 8 repeat surveys), health facility data (health centre and health post weekly malaria
burden) and school attendance data (copies of existing registers); indicators are then selected for phase 2
piloting. Phase 2 piloted indicators: reported symptoms and reported attendance of schoolchildren, and
weekly % absence from school using attendance registers.]

Figure 3. Model for school-based malaria epidemic surveillance system [33] 
Figure 4. Overview of the epiDMS system [35]


A group of researchers developed an internet-based epidemic alert system for periodontal disease in 
Nigeria. The web-based model proposed by the researchers was based on real-time statistical data for 
periodontal disease diagnosis across Nigeria. HTML, PHP and CSS were used to develop the user-friendly 
interface of the system and MySQL [36] was used to create the database of the system. The researchers 
claimed that the proposed system will help with the surveying and tracking of periodontal disease in 
Nigeria [37]. 

Some researchers [38] used a colour code based on the alert phases already defined by the WHO to
determine when to raise an alarm for the outbreak of A(H1N1) influenza in the Americas. The authors used the
basic reproduction number, $R_0$, to decide when an epidemic alert needed to be triggered. They
used accumulated data from sixteen (16) of the thirty-five (35) member states of the Americas to estimate
the basic reproduction number. A basic reproduction number greater than one ($R_0 > 1$) was the trigger for
the outbreak of an epidemic.
3. EPIDEMIC ALERT THRESHOLD ALGORITHMS
3.1. The use of change point analysis 

A change point is the point where a structural change occurs in the collected data [39]-[40]. The
series can be represented as:

$$\{x_t\} = \{x_1, x_2, \ldots, x_n\}$$

and the index of time as:

$$T \in \{1, 2, \ldots, n\}$$


The epidemic or endemic component of the process is piecewise constant. The pre-epidemic
period (endemic state), the epidemic period (epidemic state) and the post-epidemic period (endemic state)
have to be determined. Changes are first detected, then counted and estimated. If $\{x_1, x_2, \ldots, x_n\}$
is the time series of independent variables and $\theta_i$, where $i = 1, \ldots, n$, represents the
corresponding structure parameters, then a decision has to be made between:


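$$H_0: \theta_1 = \theta_2 = \cdots = \theta_n = \theta \quad \text{(no change point)}$$

and

$$H_1: \theta_1 = \cdots = \theta_k = \alpha \neq \theta_{k+1} = \cdots = \theta_t = \beta \neq \theta_{t+1} = \cdots = \theta_n = \gamma \quad \text{(change points)}$$

Note:

a. $1 < k < t < n$
b. $k$ and $t$ mark the start and end dates of the outbreak respectively; $\alpha$, $\beta$ and $\gamma$ are
   the parameter values before, during and after the outbreak
c. The rejection of $H_0$ confirms a change point.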


If $H_0$ is rejected, the number of changes in state and their actual positions have to be estimated.
That is, if $H_1$ is true, what are $k$ and $t$ given the sample $\{x_1, x_2, \ldots, x_n\}$? This change
point problem was solved using the non-parametric kernel model. Based on simulated data, the non-parametric
kernel model was used to detect the start of an outbreak and the end of an outbreak.


3.2. The kernel model 
If $\{x_1, x_2, \ldots, x_n\}$ is a time series of independent random variables, the kernel function is
defined as:

$$Y_i = K(x_i), \quad \forall i \in \{1, 2, \ldots, n\}$$

A Kernel Fisher discriminant ratio (KFDR) is used to measure the heterogeneity between the successive
segments $S_1$, $S_2$ and $S_3$:

$S_1 = \{x_1, x_2, \ldots, x_i\}$ with $i$ observations; pre-epidemic.
$S_2 = \{x_{i+1}, x_{i+2}, \ldots, x_j\}$ with $(j - i)$ observations; epidemic.
$S_3 = \{x_{j+1}, x_{j+2}, \ldots, x_n\}$ with $(n - j)$ observations; post-epidemic.

A simple linear kernel function, $k(x, y) = xy$, is used to determine the value of $k$.
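To find the KFDR between $S_1$ and $S_2$:

$$\mathrm{KFDR}(S_1, S_2) = \frac{\left(\mathrm{mean}(S_1) - \mathrm{mean}(S_2)\right)^2}{\frac{i}{j}\,\mathrm{var}(S_1) + \frac{j-i}{j}\,\mathrm{var}(S_2)}$$

$i$ and $j$ are chosen to maximize the heterogeneity between the three segments by calculating:

$$V(i, j) = \frac{i(j-i)}{j}\,\mathrm{KFDR}(S_1, S_2) + \frac{(j-i)(n-j)}{n-i}\,\mathrm{KFDR}(S_2, S_3)$$

$$(\hat{\tau}_1, \hat{\tau}_2) = \underset{(\tau_1, \tau_2) \in \{1, \ldots, n\} \times \{1, \ldots, n\},\ \tau_1 < \tau_2}{\arg\max}\ V(\tau_1, \tau_2)$$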
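
The search over the two change points can be illustrated with a minimal Python sketch. It is not the implementation used in the cited work: it assumes a squared mean difference in the ratio, size-weighted (pooled) population variances in the denominator, and segment-size weights of i(j-i)/j and (j-i)(n-j)/(n-i) in V(i, j). The names `kfdr` and `find_outbreak`, the small `eps` regulariser and the `min_len` parameter are hypothetical.

```python
from statistics import fmean, pvariance


def kfdr(seg_a, seg_b, eps=1e-9):
    """Linear-kernel Fisher discriminant ratio between two segments.

    Assumed form: squared difference of means over the size-weighted
    variance of the two segments. eps only guards against division by zero.
    """
    n_a, n_b = len(seg_a), len(seg_b)
    pooled_var = (n_a * pvariance(seg_a) + n_b * pvariance(seg_b)) / (n_a + n_b)
    return (fmean(seg_a) - fmean(seg_b)) ** 2 / (pooled_var + eps)


def find_outbreak(series, min_len=2):
    """Exhaustively search for (i, j) maximising V(i, j).

    i is the number of pre-epidemic observations and j the index of the
    last epidemic observation (1-based), so the epidemic segment is
    series[i:j] in Python slicing.
    """
    n = len(series)
    best_score, best_pair = None, None
    for i in range(min_len, n - 2 * min_len + 1):
        for j in range(i + min_len, n - min_len + 1):
            s1, s2, s3 = series[:i], series[i:j], series[j:]
            score = (i * (j - i) / j) * kfdr(s1, s2) \
                + ((j - i) * (n - j) / (n - i)) * kfdr(s2, s3)
            if best_score is None or score > best_score:
                best_score, best_pair = score, (i, j)
    return best_pair


# Hypothetical weekly counts: low endemic level, a four-week outbreak, then endemic again.
weekly_counts = [2, 3, 2, 3, 2, 20, 21, 19, 20, 3, 2, 3, 2, 3]
start_after, end_at = find_outbreak(weekly_counts)
```

The exhaustive search is quadratic in the series length, which is acceptable for short weekly series; longer series would need a more efficient search strategy.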
4. RESEARCH METHOD

The web-based grassroots model assumes a medical history and uses the third quartile value for 
each disease under consideration to set the threshold that would determine when a disease has reached 
epidemic levels. The proposed system uses the weekly percentile. 

The World Health Organization suggested the use of the third quartile (75th percentile) as a
threshold for triggering the onset of an epidemic [41].


$$P_i = L + \left[\frac{np - C_f}{f_i}\right] w$$

where:

$P_i$: the $i$th percentile
$L$: lower limit of the interval that contains the desired percentile point
$n$: total available scores
$p$: the score point in terms of the desired percentile
$C_f$: summation of the frequency scores below the percentile point's interval
$f_i$: frequency of scores in the $i$th percentile point's interval
$w$: class interval width
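
As an illustration of the grouped-data percentile formula, the following Python sketch locates the class interval that contains the desired percentile point and interpolates within it. The function name `grouped_percentile` and the class limits and frequencies in the example are hypothetical; `p` is assumed to be expressed as a proportion (0.75 for the third quartile).

```python
def grouped_percentile(lower_limits, frequencies, width, p):
    """Percentile of grouped data: L + ((n*p - C_f) / f_i) * w.

    lower_limits: lower limit L of each class interval, in ascending order
    frequencies:  number of scores in each class interval
    width:        class interval width w (assumed equal for all classes)
    p:            desired percentile as a proportion, e.g. 0.75
    """
    n = sum(frequencies)
    target = n * p          # position of the percentile point
    cum_below = 0           # C_f: scores below the current interval
    for lower, freq in zip(lower_limits, frequencies):
        if freq > 0 and cum_below + freq >= target:
            return lower + (target - cum_below) / freq * width
        cum_below += freq
    raise ValueError("no scores available")


# Hypothetical weekly case counts grouped into classes of width 10.
q3 = grouped_percentile([0, 10, 20, 30, 40], [2, 5, 8, 4, 1], width=10, p=0.75)
print(q3)  # 30.0
```

Here n = 20 and np = 15, the percentile point falls in the 20-30 class with 7 scores below it and 8 scores inside it, so the third quartile is 20 + (8/8) x 10 = 30.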


A baseline (threshold) is set for each disease, beyond which the disease is considered to have reached
epidemic levels. The “Low” alarm is triggered when the third quartile (75th percentile) value for a given
disease is higher than the baseline in any given week. The “Moderate” alarm is triggered when the third
quartile value is higher than the baseline for two consecutive weeks, the “High” alarm when it is higher for
three consecutive weeks, and the “Severe” alarm when it is higher for four consecutive weeks.
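
The escalation rule can be sketched in a few lines of Python. This is an illustrative reading of the rule, not the system's actual code: the names `ALERT_LEVELS` and `alert_level` are hypothetical, and the sketch assumes that the streak resets to non-epidemic as soon as a weekly value is not above the baseline and that the level stays at “Severe” beyond four consecutive weeks.

```python
ALERT_LEVELS = ["Non-epidemic", "Low", "Moderate", "High", "Severe"]


def alert_level(weekly_q3_values, baseline):
    """Return the alert level after the most recent week.

    weekly_q3_values: weekly third quartile values, oldest first
    baseline:         third quartile baseline (threshold) for the disease
    """
    streak = 0  # consecutive weeks above the baseline, ending at the latest week
    for value in weekly_q3_values:
        streak = streak + 1 if value > baseline else 0
    # Assumption: more than four consecutive weeks remains "Severe".
    return ALERT_LEVELS[min(streak, 4)]


print(alert_level([150], baseline=100))          # Low
print(alert_level([80, 120, 112], baseline=70))  # High
```

Keeping only the current streak means the function can be re-run at the end of every week on the stored weekly values without any extra state.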

On the user side, data collected using the proposed web-based grassroots model includes: 
1. Medical: Centre Code, Age, Sex, Symptoms, Diagnosis (50 ICD codes were used in this demonstration). 
2. Microbiology Orders: Laboratory Centre code and Pathogen. 
3. Notifiable Disease Report: Express Notification by Sentinel Surveillance. 
On the administrator side, data monitored include: 
1. Dashboard: Special Alerts and Express Notification by Sentinel Surveillance. 
2. Percentile graphs: Daily and Weekly Percentile Graphs. 

The International Classification of Diseases (ICD-10) codes and their respective threshold based on 
previously known (assumed in this case) weekly third quartile values for the sample diseases monitored by 
the proposed web-based grassroots model are shown in Table 1. 


Table 1. ICD Codes and their Third Quartile Baseline 


S/N  Disease                                       ICD Code  Third Quartile Baseline
1    Cholera*                                      A00       5
2    Plague*                                       A20       0
3    Yellow Fever*                                 A95       0
4    Small Pox*                                    B03       0
5    Relapsing Fever*                              A68       0
6    Typhus*                                       A75       15
7    Polio*                                        A80       0
8    Severe Acute Respiratory Syndrome (SARS)*     J60       30
9    Ebola virus disease*                          A98-4     0
10   Influenza*                                    J10       0
11   Lassa Fever*                                  A96-2     0
12   Marburg Hemorrhagic Fever*                    A98-3     0
13   Rift Valley Fever*                            A92-4     0
14   Tularemia*                                    A21       0
15   Dengue Hemorrhagic Fever*                     A91       0
16   Crimean-Congo Hemorrhagic Fever*              A98-0     0
17   Anthrax*                                      A22       0
18   Monkeypox                                     B04       12
19   Candidiasis                                   B37       10
20   HIV/AIDS                                      B20       20
21   Diarrhea                                      A09       30
22   Tuberculosis                                  A16       50
23   Rabies*                                       A82       0
24   Botulism                                      A05-1     23
25   Campylobacteriosis                            A04-5     15
26   Chickenpox                                    B01       35
27   Creutzfeldt-Jakob Disease                     A81-0     20
28   Dysentery                                     A06-0     12
29   Hantavirus Pulmonary Syndrome                 126       11
30   Helicobacter Pylori                           K31-2     5
31   Hepatitis B                                   B16       12
32   Hepatitis C                                   B17-1     12
33   Histoplasmosis                                B39       2
34   Leptospirosis                                 A27       35
35   Lyme Disease                                  A69-2     13
36   Measles                                       B05       15
37   Mumps                                         B26       15
38   Typhoid and Paratyphoid Fevers                A01       70
39   Diphtheria                                    A36       48
40   Schistosomiasis                               B65       50
41   Tetanus                                       A33       10
42   Toxoplasmosis                                 B58       25
43   Leprosy                                       A30       10
44   Viral meningitis                              A87       10
45   West Nile Virus                               A92-3     10
46   Dyspepsia                                     K30       25
47   Hepatitis A                                   B15       60
48   Whooping Cough                                A37       5
49   Malaria                                       B50       100
50   Scabies                                       B86       35


*Asterisked diseases are on a special watch list.
NOTE: The International Statistical Classification of Diseases (ICD-10) codes used herein do not take into
account subsets of the codes for disease variations and causative organisms.


For demographic analysis, codes were assigned to the hospitals, primary health centres and laboratories that
provide inputs to the epidemic alert system. Table 2 shows some sample centre codes for hospitals,
laboratories and primary health centres that were used to demonstrate how the proposed epidemic alert model
works. Table 3 shows a sample of the inputs obtained from laboratories. Table 4 shows a sample of the inputs
obtained from medical records in hospitals and primary health centres. Figure 7 shows the algorithm for the
proposed web-based grassroots epidemic alert system.


Table 2. Centre Codes 


S/N  Centre Code    Type                   Location
1    H01865360875   Hospital               Marque, Kingston
2    P12765098656   Primary Health Centre  Dale, Lofty Heights
3    P37659339059   Primary Health Centre  Tomahawk, Prowess
4    L78599539584   Laboratory             Balinese, Catwalk
5    H24698736483   Hospital               Cross, Time Hills
6    H54786783995   Hospital               Action Yard, Trent
7    P84898479948   Primary Health Centre  Hague, Bella vane
8    H67494672997   Hospital               Seminary, Zone; Primer
9    L75643782674   Laboratory             Hebron, Simile
10   P98573652641   Primary Health Centre  Bayville, Manama
Centre Code    Pathogen                                            Test counts
H01865360875   Vibrio cholerae, Plasmodium spp., Salmonella spp.
P12765098656   Vibrio cholerae, Plasmodium spp., Salmonella spp.
P37659339059   Vibrio cholerae, Plasmodium spp., Salmonella spp.
L78599539584   Vibrio cholerae, Plasmodium spp., Salmonella spp.
H24698736483   Vibrio cholerae, Plasmodium spp., Salmonella spp.
H54786783995   Vibrio cholerae, Plasmodium spp., Salmonella spp.
P84898479948   Vibrio cholerae, Plasmodium spp., Salmonella spp.
H67494672997   Vibrio cholerae, Plasmodium spp., Salmonella spp.
L75643782674   Vibrio cholerae, Plasmodium spp., Salmonella spp.
P98573652641   Vibrio cholerae, Plasmodium spp., Salmonella spp.


5 


Table 4. Medical Records 


Centre Code: H01865360875

Age                   Sex     Diagnosis (ICD code - count)
0-1 (Infant)          Male    B50-23, A01-4, A00-21, B05-1, B01-2, A09-1, A80-3, B86-0, B50-5, A00-0, B16-7, B37-1
0-1 (Infant)          Female  B50-23, A01-3, A00-12, B05-0, B01-2, A09-0, A80-3, B86-2, B50-5, A00-0, B16-1, B37-0
1-4 (Toddler)         Male    B50-23, A01-7, A00-21, B05-1, B01-2, A09-1, A80-3, B86-2, B50-5, A00-0, B16-7, B37-0
1-4 (Toddler)         Female  B50-22, A01-4, A00-21, B05-1, B01-3, A09-1, A80-3, B86-2, B50-5, A00-0, B16-7, B37-0
5-12 (Child)          Male    B50-15, A01-4, A00-21, B05-1, B01-2, A09-1, A80-5, B86-2, B50-5, A00-5, B16-7, B37-1
5-12 (Child)          Female  B50-20, A01-4, A00-21, B05-1, B01-2, A09-1, A80-3, B86-2, B50-5, A00-1, B16-7, B37-1
13-17 (Teenager)      Male    B50-22, A01-4, A00-21, B05-1, B01-2, A09-1, A80-3, B86-2, B50-5, A00-5, B16-7, B37-1
13-17 (Teenager)      Female  B50-21, A01-4, A00-18, B05-1, B01-2, A09-1, A80-3, B86-2, B50-5, A00-3, B16-7, B37-0
18-59 (Adult)         Male    B50-18, A01-4, A00-20, B05-1, B01-2, A09-1, A80-3, B86-2, B50-5, A00-4, B16-7, B37-0
18-59 (Adult)         Female  B50-15, A01-4, A00-16, B05-1, B01-2, A09-1, A80-3, B86-2, B50-5, A00-2, B16-7, B37-1
60 and above (Elder)  Male    B50-20, A01-4, A00-15, B05-1, B01-2, A09-1, A80-3, B86-2, B50-5, A00-4, B16-5, B37-1
60 and above (Elder)  Female  B50-20, A01-4, A00-14, B05-1, B01-2, A09-1, A80-3, B86-1, B50-5, A00-2, B16-7, B37-0



4.1. The model for the user interface 
Figure 5 shows the model of the user interface for the proposed web-based grassroots epidemic alert 
system. 


4.2. Model for administrator interface 
Figure 6 shows the model of the administration interface for the proposed web-based grassroots 
epidemic alert system. 
[Figure 5 diagram: Graphical User Interface with three inputs, Medical Records (Centre Code, Age, Sex, ICD
Code), Microbiology Orders (Pathogen) and Notifiable Disease Report (Diseases on Watch List, Undiagnosed
Diseases), all feeding the Database.]


Figure 5. Model of the user interface for the proposed web-based grassroots epidemic alert system 


[Figure 6 diagram: Epidemic Alerts (Alert Level: Low, Moderate, High, Severe) and Special Alerts for
Diseases on Watch List lead to Investigate Further, which offers Percentile Graphs, Bar Charts, Causative
Organism, Undiagnosed Diseases and Demographics (Location, Sex, Age).]

Figure 6. Model of the administration interface for the proposed web-based grassroots epidemic alert 
system 


5. RESULTS AND ANALYSIS 

The baseline values for all the diseases being monitored are depicted in Figure 8 and Figure 9. They are
based on the corresponding third quartile value for each ICD code. Some ICD codes, such as A20, A82 and A80,
have corresponding baseline or threshold values of zero. This is because they are diseases on the special
watch list. It is assumed that any recorded case of the diseases on the watch list should generate swift and
appropriate action to protect public health.

Figure 10 shows the minimum, first quartile, median, third quartile and maximum values for the
ICD codes A01, A36, B65, A33 and B58. It can be seen clearly from the plot that none of the diseases under
observation has exceeded its threshold value. The third quartile values of the observed diseases, read from
the plot, are listed in Table 5.


[Figure 7 flowchart: START. First week: are the disease cases higher than the known threshold? If NO, is the
disease on the watch list? (YES: SPECIAL ALERT; NO: NON-EPIDEMIC). If YES: ALERT: LOW. Second week: are the
disease cases higher than the known threshold? YES: ALERT: MODERATE. Third week: are the disease cases higher
than the known threshold? YES: ALERT: HIGH. Fourth week: are the disease cases higher than the known
threshold? YES: ALERT: SEVERE. NO branches are also drawn for the later weeks.]


Figure 7. Algorithm for the proposed web-based grassroots epidemic alert system 




Figure 8. Baseline graph for each ICD code at the 75th percentile (third quartile)

Figure 9. Bar representation of weekly thresholds for all fifty (50) diseases

[Figure 10 plot: series for A01, A36, B65, A33 and B58 across MIN, 25th percentile, MEDIAN, 75th percentile
and MAX]

Figure 10. Weekly baseline graph for selected diseases 


The actual third quartile values in Table 5 are all less than the baseline values. Therefore, the 
observed diseases are all non-epidemic. 


Table 5. Comparison between Actual Third Quartile Values and Baseline Values

ICD Code  Actual Third Quartile Value  Baseline Value  Epidemic Alert
A01       60                           70              Non-Epidemic
A36       34                           48              Non-Epidemic
B65       36                           50              Non-Epidemic
A33       6                            10              Non-Epidemic
B58       14                           25              Non-Epidemic


In Figure 11 there is an outlier, which corresponds to the ICD code B50. Its third quartile value is
greater than the threshold. This does not mean that a disease has to be an outlier before the epidemic alarm
is triggered: the only determining factor for triggering the epidemic alert system is the disease’s third
quartile value at the end of every week. From Table 6 it can be seen that malaria (B50) has reached epidemic
levels and a “Low Alert” was triggered.
[Figure 11 plot: weekly series across MIN, 25th percentile, MEDIAN, 75th percentile and MAX]


Figure 11. Weekly series graph with malaria (B50) at the “low alert” level 


Table 6. Malaria (B50) at Epidemic Levels

Disease  Actual Third Quartile Value  Baseline Value  Epidemic Alert
K30      18                           25              Non-epidemic
B15      57                           60              Non-epidemic
A37      2                            5               Non-epidemic
B50      150                          100             Low Alert
B86      20                           35              Non-epidemic


From Figure 12, it can be observed that the third quartile values of A01 for three consecutive weeks were
above the threshold (baseline) value of seventy (70). This triggered the “High Alert” for Typhoid and
Paratyphoid Fevers. Table 7 shows this result clearly.


[Figure 12 plot, titled "Third Quartile Values for Three Consecutive Weeks": third quartile value of A01
(y-axis) for Week 1, Week 2 and Week 3 (x-axis)]


Figure 12. Series graph with typhoid (A01) at the “high alert” level 


Table 7. Typhoid and Paratyphoid Fevers (A01) at High Alert

Disease  Week 1  Week 2  Week 3
A01      80      120     112


In Figure 13, the ICD code A00 has third quartile values that exceed the baseline for four
consecutive weeks. Thus, the “Severe” alarm is triggered for ICD code A00. Although ICD code A01 had third
quartile values above the threshold for three consecutive weeks, its epidemic status has been changed to
non-epidemic because its third quartile value in the fourth week is less than the threshold value. This is
shown in Table 8.

A special exception is made for diseases that are on the watch list. In Figure 14, A80 has a third 
quartile value of zero (0) but a maximum value of one (1). Being on the watch list, A80 with a maximum 
value of one (1) triggered a “Special” alarm made for diseases on the watch list or diseases that have been 
confirmed eradicated officially. Table 9 sheds light on this special case. 
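
The watch-list exception can be sketched as a separate check on the weekly maximum rather than on the third quartile. The sketch below is an assumption-laden illustration: the names `WATCH_LIST` and `special_alert` are hypothetical, the set holds only a few of the asterisked ICD codes from Table 1, and the rule "any recorded case triggers the alarm" is read as a weekly maximum of at least one.

```python
# Illustrative subset of the asterisked (watch-list) ICD codes in Table 1.
WATCH_LIST = {"A20", "A80", "A82", "A98-4"}


def special_alert(icd_code, weekly_max, watch_list=WATCH_LIST):
    """Return True when a watch-list disease has at least one recorded case.

    weekly_max: maximum value recorded for the disease during the week
    """
    return icd_code in watch_list and weekly_max >= 1


print(special_alert("A80", weekly_max=1))    # True: one polio case recorded
print(special_alert("A98-4", weekly_max=0))  # False: no case recorded
```

Checking the maximum instead of the third quartile matters here because a single case in a week leaves the third quartile at zero, so the ordinary threshold comparison would never fire.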

A demographic analysis gives the percentage of infants, toddlers, children, teenagers, adults and
aged persons affected by an epidemic outbreak [42]. It also provides information about the location of the
outbreaks. It may also give an idea of the causative organism. A sample of the demographic analysis of
Centre Code H01865360875 for a cholera outbreak is shown in Figure 15.
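
A minimal sketch of the percentage breakdown by age group is given below. The function name `demographic_share` and the case counts are hypothetical and are not taken from the sample data; the same function could be applied to counts grouped by sex or by centre code.

```python
def demographic_share(case_counts):
    """Convert case counts per group into percentages of all confirmed cases."""
    total = sum(case_counts.values())
    if total == 0:
        return {group: 0.0 for group in case_counts}
    return {group: 100 * count / total for group, count in case_counts.items()}


# Hypothetical confirmed cholera cases per age group at one centre.
cases_by_age = {"Infants": 2, "Toddlers": 0, "Children": 4,
                "Teenagers": 2, "Adults": 10, "Elders": 2}
shares = demographic_share(cases_by_age)
print(shares["Adults"])  # 50.0
```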
[Figure 13 plot, titled "Third Quartile Values for Four Consecutive Weeks"]

Figure 13. Series graph with cholera (A00) at the “severe alert” level


Table 8. Cholera (A00) and Typhoid Fever (A01) on “Severe” and Non-epidemic Alerts Respectively

Disease  Week 1  Week 2  Week 3
A00      5       13      16
A01      70      120     112

[Figure 14 plot: weekly series across MIN, 25th percentile, MEDIAN, 75th percentile and MAX]


Figure 14. Weekly series graph with polio on “special alert” 


Table 9. Polio (A80) on Special Alert 


Disease  Actual Third Quartile Value  Baseline Value  Maximum Value
A75      12                           15              13
A80      0                            0               1
J60      20                           30              21
A98-4    0                            0               0


[Figure 15 chart, titled "Confirmed Cholera Cases in Centre (H01865360875 - Marque, Kingston)": male and
female case counts for Infants, Toddlers, Children, Teenagers, Adults and Elders]


Figure 15. Sample demographic analysis 


6. CONCLUSION
The proposed web-based epidemic alert system is a first step towards standardizing web-based
disease surveillance systems. It is a grassroots model that favours simplicity and “traditionality” over
complexity and “sophistication” in order to give timely epidemic alerts that can be disseminated to various
stakeholders for appropriate action.


7. FURTHER WORK 

A model for grading the severity of an epidemic, together with the corresponding response for each stage,
will be considered in future work. A more comprehensive and effective algorithm for epidemic
detection will also be developed.